
    Time-Sensitive Bayesian Information Aggregation for Crowdsourcing Systems

    Crowdsourcing systems commonly face the problem of aggregating multiple judgments provided by potentially unreliable workers. In addition, several aspects of the design of efficient crowdsourcing processes, such as setting workers' bonuses, fair prices, and time limits for tasks, involve knowledge of the likely duration of the task at hand. Bringing this together, in this work we introduce a new time-sensitive Bayesian aggregation method that simultaneously estimates a task's duration and obtains reliable aggregations of crowdsourced judgments. Our method, called BCCTime, builds on the key insight that the time taken by a worker to perform a task is an important indicator of the likely quality of the produced judgment. To capture this, BCCTime uses latent variables to represent the uncertainty about the workers' completion times, the tasks' durations, and the workers' accuracy. To relate the quality of a judgment to the time a worker spends on a task, our model assumes that each task is completed within a latent time window within which all workers with a propensity to genuinely attempt the labelling task (i.e., no spammers) are expected to submit their judgments. In contrast, workers with a lower propensity to valid labelling, such as spammers, bots, or lazy labellers, are assumed to perform tasks considerably faster or slower than the time required by normal workers. Specifically, we use efficient message-passing Bayesian inference to learn approximate posterior probabilities of (i) the confusion matrix of each worker, (ii) the propensity to valid labelling of each worker, (iii) the unbiased duration of each task, and (iv) the true label of each task. Using two real-world public datasets for entity linking tasks, we show that BCCTime produces up to 11% more accurate classifications and up to 100% more informative estimates of a task's duration compared to state-of-the-art methods.
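    The abstract does not give BCCTime's model equations, so the following is only a minimal Python sketch of the core intuition, not the actual message-passing model: judgments submitted far outside a task's typical completion-time window are treated as likely spam and discounted before aggregation. The IQR-based window and the majority vote are illustrative stand-ins; BCCTime instead infers the latent window and per-worker confusion matrices jointly.

        import numpy as np

        def aggregate_time_sensitive(labels, times, k=1.5):
            """Toy stand-in for BCCTime's intuition: judgments with implausible
            completion times are dropped before aggregating.

            labels: integer judgments for one task, one per worker
            times:  matching completion times in seconds
            k:      IQR multiplier defining the 'genuine attempt' window (assumed)
            """
            labels, times = np.asarray(labels), np.asarray(times)
            q1, q3 = np.percentile(times, [25, 75])
            lo, hi = q1 - k * (q3 - q1), q3 + k * (q3 - q1)  # latent-window stand-in
            genuine = (times >= lo) & (times <= hi)          # plausible completion times
            votes = labels[genuine] if genuine.any() else labels
            values, counts = np.unique(votes, return_counts=True)
            return values[np.argmax(counts)], (lo, hi)       # majority vote over kept votes

        # e.g. the 3 s and 900 s judgments fall outside the inferred window
        label, window = aggregate_time_sensitive([1, 1, 0, 1, 0], [42, 55, 3, 61, 900])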

    Age-related features of the area of lymphoid nodules of the human pharyngeal wall

    Keywords: pharynx /anatomy; throat /anatomy; age groups; lymphoid tissue /anatomy; lymphadenoid tissue /anatomy; lymphoreticular tissue /anatomy; newborn; newborn infant; infant; young children; children; adolescents; persons of juvenile age; adults; the elderly; mucous membrane /anatomy; area of lymphoid nodules of the pharyngeal wall; lymphoid nodules of the pharyngeal wall; age-related features

    Ultrasound-enhanced latex immunoagglutination and PCR as complementary methods for non-culture-based confirmation of meningococcal disease

    Preadmission administration of antibiotics to patients with suspected meningococcal infection has decreased the likelihood of obtaining an isolate and has stimulated the development of rapid and reliable non-culture-based diagnostic methods. The sensitivity of the conventional test card latex agglutination test (TCLAT) for detection of capsular polysaccharide has been reported to be suboptimal. In the United Kingdom, meningococcal DNA detection by PCR has become readily available and is now used as a first-line investigation. Recently, the performance of latex antigen detection has been markedly improved by ultrasound enhancement. Three tests for laboratory confirmation of meningococcal infection, (i) PCR assays, (ii) TCLAT, and (iii) the ultrasound-enhanced latex agglutination test (USELAT), were compared in a retrospective study of 125 specimens (serum, plasma, and cerebrospinal fluid) from 90 patients in whom meningococcal disease was suspected on clinical grounds. Samples were from patients with (i) culture-confirmed meningococcal disease, (ii) culture-negative but PCR-confirmed meningococcal disease, and (iii) clinically suspected but non-laboratory-confirmed meningococcal disease. USELAT was found to be nearly five times more sensitive than TCLAT. Serogroup characterization was obtained by both PCR and USELAT for 44 samples; all results were concordant and agreed with the serogroups determined for the isolates, where available. For 12 samples negative by USELAT, the serogroup was determined by PCR; for 12 other specimens in which PCR had failed to indicate the serogroup, USELAT gave a result. USELAT is a rapid, low-cost method which can confirm a diagnosis, identify serogroups, and guide appropriate management of meningococcal disease contacts. A complementary non-culture-based confirmation strategy of USELAT for local use, supported by a centralized PCR assay service for detection of meningococci, would give the benefits of timely information and improved epidemiological data.

    Cancer risk in 680 000 people exposed to computed tomography scans in childhood or adolescence: Data linkage study of 11 million Australians

    Objective To assess the cancer risk in children and adolescents following exposure to low dose ionising radiation from diagnostic computed tomography (CT) scans. Design Population based, cohort, data linkage study in Australia. Cohort members 10.9 million people identified from Australian Medicare records, aged 0-19 years on 1 January 1985 or born between 1 January 1985 and 31 December 2005; all exposures to CT scans funded by Medicare during 1985-2005 were identified for this cohort. Cancers diagnosed in cohort members up to 31 December 2007 were obtained through linkage to national cancer records. Main outcome Cancer incidence rates in individuals exposed to a CT scan more than one year before any cancer diagnosis, compared with cancer incidence rates in unexposed individuals. Results 60 674 cancers were recorded, including 3150 among the 680 211 people exposed to a CT scan at least one year before any cancer diagnosis. The mean duration of follow-up after exposure was 9.5 years. Overall cancer incidence was 24% greater for exposed than for unexposed people, after accounting for age, sex, and year of birth (incidence rate ratio (IRR) 1.24 (95% confidence interval 1.20 to 1.29); P<0.001). We saw a dose-response relation: the IRR increased by 0.16 (0.13 to 0.19) for each additional CT scan. The IRR was greater after exposure at younger ages (P<0.001 for trend). At 1-4, 5-9, 10-14, and 15 or more years since first exposure, IRRs were 1.35 (1.25 to 1.45), 1.25 (1.17 to 1.34), 1.14 (1.06 to 1.22), and 1.24 (1.14 to 1.34), respectively. The IRR increased significantly for many types of solid cancer (digestive organs, melanoma, soft tissue, female genital, urinary tract, brain, and thyroid) as well as for leukaemia, myelodysplasia, and some other lymphoid cancers. There was an excess of 608 cancers in people exposed to CT scans (147 brain, 356 other solid, 48 leukaemia or myelodysplasia, and 57 other lymphoid). The absolute excess incidence rate for all cancers combined was 9.38 per 100 000 person years at risk, as of 31 December 2007. The average effective radiation dose per scan was estimated at 4.5 mSv. Conclusions The increased incidence of cancer after CT scan exposure in this cohort was mostly due to irradiation. Because the cancer excess was still continuing at the end of follow-up, the eventual lifetime risk from CT scans cannot yet be determined. Radiation doses from contemporary CT scans are likely to be lower than those in 1985-2005, but some increase in cancer risk is still likely from current scans. Future CT scans should be limited to situations in which there is a definite clinical indication, with every scan optimised to provide a diagnostic image at the lowest possible radiation dose.
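    For concreteness, the reported absolute excess can be cross-checked with a few lines of arithmetic; this is a sketch that assumes the mean follow-up of 9.5 years applies uniformly to every exposed person, which only approximates the true person-years at risk.

        # Cross-check of the reported absolute excess, using only figures quoted above.
        exposed = 680_211             # people exposed >= 1 year before any diagnosis
        mean_followup_years = 9.5     # mean follow-up after exposure (assumed uniform)
        excess_rate = 9.38 / 100_000  # excess cancers per person-year at risk

        person_years = exposed * mean_followup_years
        print(f"implied excess cancers: {excess_rate * person_years:.0f}")
        # prints ~606, consistent with the 608 excess cancers reported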

    Developmental Profiles of Eczema, Wheeze, and Rhinitis: Two Population-Based Birth Cohort Studies

    The term "atopic march" has been used to imply a natural progression of a cascade of symptoms from eczema to asthma and rhinitis through childhood. We hypothesize that this expression does not adequately describe the natural history of eczema, wheeze, and rhinitis during childhood. We propose that this paradigm arose from cross-sectional analyses of longitudinal studies, and may reflect a population pattern that may not predominate at the individual level.Data from 9,801 children in two population-based birth cohorts were used to determine individual profiles of eczema, wheeze, and rhinitis and whether the manifestations of these symptoms followed an atopic march pattern. Children were assessed at ages 1, 3, 5, 8, and 11 y. We used Bayesian machine learning methods to identify distinct latent classes based on individual profiles of eczema, wheeze, and rhinitis. This approach allowed us to identify groups of children with similar patterns of eczema, wheeze, and rhinitis over time. Using a latent disease profile model, the data were best described by eight latent classes: no disease (51.3%), atopic march (3.1%), persistent eczema and wheeze (2.7%), persistent eczema with later-onset rhinitis (4.7%), persistent wheeze with later-onset rhinitis (5.7%), transient wheeze (7.7%), eczema only (15.3%), and rhinitis only (9.6%). When latent variable modelling was carried out separately for the two cohorts, similar results were obtained. Highly concordant patterns of sensitisation were associated with different profiles of eczema, rhinitis, and wheeze. The main limitation of this study was the difference in wording of the questions used to ascertain the presence of eczema, wheeze, and rhinitis in the two cohorts.The developmental profiles of eczema, wheeze, and rhinitis are heterogeneous; only a small proportion of children (∼ 7% of those with symptoms) follow trajectory profiles resembling the atopic march. Please see later in the article for the Editors' Summary

    Language Understanding in the Wild: Combining Crowdsourcing and Machine Learning

    Social media has led to the democratisation of opinion sharing. A wealth of information about public opinions, current events, and authors' insights into specific topics can be gained by understanding the text written by users. However, there is wide variation in the language used by different authors in different contexts on the web. This diversity in language makes interpretation an extremely challenging task. Crowdsourcing presents an opportunity to interpret the sentiment, or topic, of free text. However, the subjectivity and bias of human interpreters raise challenges in inferring the semantics expressed by the text. To overcome this problem, we present a novel Bayesian approach to language understanding that relies on aggregated crowdsourced judgements. Our model encodes the relationships between labels and text features in documents, such as tweets, web articles, and blog posts, accounting for the varying reliability of human labellers. It allows inference of annotations that scales to arbitrarily large pools of documents. Our evaluation shows that, by efficiently exploiting language models learnt from aggregated crowdsourced labels, we can improve classifications by up to 25% when only a small portion (less than 4%) of documents has been labelled. Compared to six state-of-the-art methods, we reduce by up to 67% the number of crowd responses required to achieve comparable accuracy. Our method was a joint winner of the CrowdFlower CrowdScale 2013 Shared Task challenge at the Conference on Human Computation and Crowdsourcing (HCOMP 2013).
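    The paper's model jointly infers labeller reliability and a language model; the sketch below is a much simpler two-stage stand-in: aggregate crowd judgements with per-worker reliability weights, then train a text classifier on the aggregated labels so that unlabelled documents can also be classified. All function and variable names are illustrative, not the paper's.

        from collections import defaultdict
        from sklearn.feature_extraction.text import CountVectorizer
        from sklearn.naive_bayes import MultinomialNB

        def weighted_vote(judgements, reliability):
            """judgements: (doc_id, worker_id, label) triples;
            reliability: worker_id -> weight in [0, 1] (0.5 if unknown)."""
            scores = defaultdict(lambda: defaultdict(float))
            for doc, worker, label in judgements:
                scores[doc][label] += reliability.get(worker, 0.5)
            return {doc: max(lbls, key=lbls.get) for doc, lbls in scores.items()}

        def train_on_crowd(texts, judgements, reliability):
            """texts: list of document strings; doc ids index into it."""
            labels = weighted_vote(judgements, reliability)
            ids = sorted(labels)
            vec = CountVectorizer()
            X = vec.fit_transform(texts[i] for i in ids)
            clf = MultinomialNB().fit(X, [labels[i] for i in ids])
            return vec, clf  # vec.transform(...) + clf.predict(...) label new documents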

    Bayesian inference for Plackett-Luce ranking models

    This paper gives an efficient Bayesian method for inferring the parameters of a Plackett-Luce ranking model. Such models are parameterised distributions over rankings of a finite set of objects and have typically been studied and applied within the psychometric, sociometric, and econometric literature. The inference scheme is an application of Power EP (expectation propagation). The scheme is robust and can be readily applied to large-scale data sets. The inference algorithm extends to variations of the basic Plackett-Luce model, including partial rankings. We show a number of advantages of the EP approach over the traditional maximum likelihood method. We apply the method to aggregate rankings of NASCAR racing drivers over the 2002 season, and also to rankings of movie genres.
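    The abstract does not reproduce the model, so as background, here is the Plackett-Luce likelihood that such inference targets: a ranking is built top-down, with each position filled by an item with probability proportional to its worth among the items not yet ranked. This is a minimal sketch of the likelihood only; the Power EP inference itself is not shown.

        import numpy as np

        def plackett_luce_log_likelihood(ranking, worth):
            """Log-probability of an observed full ranking under a Plackett-Luce model.

            ranking: all item indices ordered best to worst
            worth:   positive skill parameter per item
            """
            worth = np.asarray(worth, dtype=float)
            w = worth[list(ranking)]
            ll = 0.0
            for i in range(len(w) - 1):          # the final choice has probability 1
                ll += np.log(w[i]) - np.log(w[i:].sum())
            return ll

        # three drivers with worths 3.0, 1.0, 0.5 finishing in the order 0, 1, 2
        print(plackett_luce_log_likelihood([0, 1, 2], [3.0, 1.0, 0.5]))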

    Abstract

    We address the problem of learning to rank based on a large feature set and a training set of judged documents for given queries. Recently there has been interest in using IR evaluation metrics to assist in training ranking functions. However, direct optimization of an IR metric such as NDCG with respect to model parameters is difficult because such a metric is non-smooth with respect to document scores. Recently Taylor et al. presented a method called SoftRank, which smooths a metric such as NDCG by introducing uncertainty into the scores, thus making it amenable to optimization. In this paper we extend SoftRank by combining it with a Gaussian process (GP) model for the ranking function. The advantage is that the SoftRank smoothing uncertainties are naturally supplied by the GP, reflecting the underlying modelling uncertainty in individual document scores. We can also use these document uncertainties to rank differently, depending on how risky or conservative we want to make the ranking. We test our method on the publicly available LETOR OHSUMED data set and show very competitive results.

    1 IR metrics

    Our task in information retrieval is to choose and present, possibly in an ordered list, a set of documents relevant to the query entered by a user. In order to design and improve retrieval systems we need to evaluate the quality of the retrieved results. In practice this is usually done using a set of judged documents for multiple queries. The documents are judged for relevance to the query using either binary labels or a graded scale. An IR metric or utility is a function of the judged labels for a returned set of documents. Many such metrics have been developed (see e.g. [1]) to try to capture various aspects of user preference for the retrieved set. For example, in a ranked list a user presumably pays most attention to the head of the list, and we therefore want to make sure we get the very relevant documents right at the top.

    1.1 NDCG

    Normalized Discounted Cumulative Gain (NDCG) [2] is an IR metric that is a function of graded relevance labels, typically 0 (bad) to 4 (perfect), which focuses on the top of a ranking using a discount function. It is defined as $G = G_{\max}^{-1} \sum_{r=1}^{N} \left(2^{l(r)} - 1\right) D(r)$, where $l(r)$ is the relevance label of the document at rank $r$, $D(r)$ is a rank discount, typically $D(r) = 1/\log_2(1 + r)$, and $G_{\max}$ normalises so that the best possible ranking scores 1.
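    For concreteness, a direct computation of the metric as defined above; the discount $D(r) = 1/\log_2(1 + r)$ is the common choice and is assumed here.

        import numpy as np

        def ndcg(labels_in_rank_order):
            """NDCG for one query; labels are graded relevances (0..4) in ranked order."""
            labels = np.asarray(labels_in_rank_order, dtype=float)
            ranks = np.arange(1, len(labels) + 1)
            discount = 1.0 / np.log2(1 + ranks)             # D(r) = 1 / log2(1 + r)
            gain = 2.0 ** labels - 1.0                      # 2^{l(r)} - 1
            ideal = np.sum(np.sort(gain)[::-1] * discount)  # G_max: best possible ordering
            return np.sum(gain * discount) / ideal if ideal > 0 else 0.0

        print(ndcg([4, 2, 0, 3, 1]))  # one ranking of five judged documents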